Custom Vocabularies in Vale
Vale’s vocabulary system lets you manage terminology separately from styles, ensuring consistent language across documentation while allowing customization of third-party styles without modification. This structured approach benefits industries like technical writing, legal documentation, healthcare compliance, and software development.
How Vocabularies Work
Vocabularies consist of two plain-text files stored in a specific folder structure:
accept.txt – Approved terms and phrases added to all styles listed in BasedOnStyles.
reject.txt – Terms to flag as errors, automatically applied via the Vale.Avoid rule.
Benefits of Custom Vocabularies
✔ Enforce consistency – Ensure correct terminology usage.
✔ Prevent unwanted terms – Flag discouraged words.
✔ Simplify style management – Update vocabulary without altering third-party styles.
Storage Location:
<StylesPath>/config/vocabularies/<name>/
where <name> is your vocabulary’s identifier. This is referenced in your .vale.ini file.
📂 Organizing Your Vocabulary Files
Example Directory Structure:
styles/
├───MyStyle/
├───config/
│   └───vocabularies/
│       └───DomainTerms/
│           ├───accept.txt
│           └───reject.txt
└───MyOtherStyle/
- Both accept.txt and reject.txt list one entry per line.
- Entries default to case-sensitive regular expressions.
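The layout above can be created by hand, but a small script makes it repeatable. The following is a minimal sketch in Python, not part of Vale itself; the helper name scaffold_vocab and the sample terms are hypothetical, and the folder layout is the one described in this section.

```python
from pathlib import Path
import tempfile


def scaffold_vocab(styles_path: Path, name: str,
                   accept: list[str], reject: list[str]) -> Path:
    """Create <styles_path>/config/vocabularies/<name>/ with both vocabulary files."""
    vocab_dir = styles_path / "config" / "vocabularies" / name
    vocab_dir.mkdir(parents=True, exist_ok=True)
    # Write one entry per line, as the vocabulary format expects.
    (vocab_dir / "accept.txt").write_text("\n".join(accept) + "\n", encoding="utf-8")
    (vocab_dir / "reject.txt").write_text("\n".join(reject) + "\n", encoding="utf-8")
    return vocab_dir


if __name__ == "__main__":
    # A temporary directory keeps the demo from touching a real project.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_vocab(Path(tmp) / "styles", "DomainTerms",
                                 ["JavaScript", "GitHub"], ["Javascript", "Github"])
        print(sorted(p.name for p in created.iterdir()))
```

Pointing styles_path at a real project's styles directory would produce the DomainTerms folder shown in the example structure.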
🚀 Activating a Custom Vocabulary
To enable the DomainTerms vocabulary, define StylesPath and reference the vocabulary name in .vale.ini:
# Path to styles directory
StylesPath = styles
# Enable the custom vocabulary
Vocab = DomainTerms
# Apply styles globally or to specific file types
[*]
BasedOnStyles = Vale, MyStyle
How It Works:
StylesPath = styles – Specifies where custom styles and vocabularies reside.
Vocab = DomainTerms – Activates DomainTerms, stored in styles/config/vocabularies/DomainTerms/.
BasedOnStyles = Vale, MyStyle – Uses Vale’s default rules alongside MyStyle.
Once configured, Vale will:
- Enforce terms in accept.txt (e.g., standardized spelling and capitalization).
- Flag terms in reject.txt as errors.
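A common setup mistake is a Vocab name that does not match any folder on disk. The sketch below is a hypothetical pre-flight check, not a Vale feature: it reads the global keys from .vale.ini and lists the vocabulary files it cannot find. It assumes that a relative StylesPath is resolved against the folder containing .vale.ini, that Vocab may hold a comma-separated list, and that both files are expected to exist; adjust it if your setup differs.

```python
from pathlib import Path


def read_global_keys(ini_text: str) -> dict[str, str]:
    """Collect key = value pairs that appear before the first [section] header."""
    keys: dict[str, str] = {}
    for raw in ini_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("["):
            break  # global keys end at the first section header
        key, sep, value = line.partition("=")
        if sep:
            keys[key.strip()] = value.strip()
    return keys


def missing_vocab_files(config_file: Path) -> list[Path]:
    """Return the accept.txt / reject.txt paths that the config implies but that do not exist."""
    keys = read_global_keys(config_file.read_text(encoding="utf-8"))
    if "StylesPath" not in keys:
        raise ValueError("StylesPath is not set in the global section")
    # Assumption: StylesPath is relative to the config file's folder.
    styles = config_file.parent / keys["StylesPath"]
    missing = []
    for name in filter(None, (n.strip() for n in keys.get("Vocab", "").split(","))):
        for filename in ("accept.txt", "reject.txt"):
            candidate = styles / "config" / "vocabularies" / name / filename
            if not candidate.is_file():
                missing.append(candidate)
    return missing
```

Calling missing_vocab_files(Path(".vale.ini")) from the project root returns an empty list when every referenced vocabulary folder is in place.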
Testing the Configuration
Run Vale on a test document:
vale --config=.vale.ini your-document.md
If terms from reject.txt appear or accept.txt terms are misused, Vale will flag them.
✅ Approved Terminology (accept.txt)
Defines standardized spellings, capitalization, and formats.
Example accept.txt
# Standard NLP Tokens (brackets escaped so they are matched literally)
\[PAD\]
\[UNK\]
\[CLS\]
\[SEP\]
\[MASK\]
# Legal Terms
arbitration_clause
breach_of_contract
confidential_information
force_majeure
indemnification
intellectual_property
jurisdiction
non_disclosure_agreement
service_level_agreement
termination_for_convenience
# Medical Terms
clinical_trial
FDA_approval
HIPAA_compliance
informed_consent
medical_malpractice
patient_confidentiality
pharmacovigilance
telemedicine
# Technology Terms (Case-Insensitive Matching)
(?i)Artificial Intelligence
(?i)Blockchain
(?i)Cloud Computing
(?i)Cryptocurrency
(?i)Decentralized Finance
(?i)Internet of Things
(?i)Machine Learning
(?i)Neural Network
(?i)Quantum Computing
(?i)Smart Contract
(?i)Big Data
JavaScript
TypeScript
React
Node.js
API
CLI
GitHub
Markdown
MDX
SEO
reStructuredText
AsciiDoc
front matter
Hugo
VS Code
Visual Studio Code
command-line interface
application programming interface
toolset
backlink
How Vale Uses accept.txt
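✔ Ensures consistency – Enforces standardized capitalization for case-sensitive entries (e.g., always JavaScript). Entries prefixed with (?i), such as (?i)Blockchain, accept any capitalization.
✔ Prevents variations – Records non_disclosure_agreement as the approved form; alternative spellings are caught by the patterns in reject.txt.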
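Because entries are read as regular expressions, characters such as [ ] and . carry special meaning unless they are escaped. The sketch below uses Python's re module as a stand-in to show the effect; Vale uses its own regex engine, so treat this as an approximation of the behavior, and matches_whole as a hypothetical helper.

```python
import re


def matches_whole(entry: str, text: str) -> bool:
    """Return True if the entry, read as a regex, matches the entire text."""
    return re.fullmatch(entry, text) is not None


# Unescaped brackets form a character class: a single P, A, or D.
print(matches_whole("[PAD]", "[PAD]"))       # False
print(matches_whole("[PAD]", "P"))           # True
# Escaping the brackets matches the literal token.
print(matches_whole(r"\[PAD\]", "[PAD]"))    # True
# An unescaped dot matches any character, so near-misses slip through.
print(matches_whole("Node.js", "Nodexjs"))   # True
print(matches_whole(re.escape("Node.js"), "Nodexjs"))  # False
```

For entries meant to be taken literally, escaping the metacharacters (as re.escape does here) keeps the entry from matching more than intended.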
❌ Prohibited Terminology (reject.txt)
Flags incorrect, outdated, or inconsistent terms.
Example reject.txt
# Incorrect or discouraged legal terms
[Nn]on[- ]?disclosure[- ]?agreement
[iI]ntellectual[- ]?property[- ]?rights
[Ff]orce[- ]?majeure[- ]?clause
# Incorrect or inconsistent medical terms
[Pp]atient[- ]?data[- ]?privacy
[Ff]ederal[- ]?Drug[- ]?Administration
# Common technology misuses
[Bb]lock chain
[Cc]rypto[- ]currency
[Ii]nternet-?of-?things
[Aa]rtifical[- ]?intelligence
[Qq]uantum[- ]?computing[- ]?algorithm
Javascript
Typescript
ReactJS
NodeJS
Github
MarkDown
reST
Asciidoc
Frontmatter
VSCode
How Vale Uses reject.txt
❌ Flags incorrect terms (e.g., crypto currency is reported as an error, so the writer can change it to Cryptocurrency).
❌ Prevents outdated terms (e.g., patient data privacy is flagged, prompting the approved patient_confidentiality instead).
❌ Catches variations (e.g., [Nn]on[- ]?disclosure[- ]?agreement detects "non disclosure agreement", "Non-disclosure agreement", "non-disclosure-agreement", etc., but not "Non Disclosure Agreement", because matching is case-sensitive).
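To see exactly which spellings a reject pattern covers, it helps to run it against a list of candidates before committing it. The sketch below does this with Python's re module, which approximates but is not identical to Vale's regex engine; the candidate strings are illustrative.

```python
import re

# The non-disclosure pattern from the example reject.txt.
NDA = re.compile(r"[Nn]on[- ]?disclosure[- ]?agreement")

candidates = [
    "non disclosure agreement",
    "Non-disclosure agreement",
    "non-disclosure-agreement",
    "nondisclosureagreement",
    "Non Disclosure Agreement",   # capital D and A are not covered
    "non_disclosure_agreement",   # approved form; underscore is not in [- ]
]

flagged = [c for c in candidates if NDA.fullmatch(c)]
not_flagged = [c for c in candidates if not NDA.fullmatch(c)]

print(flagged)
print(not_flagged)
```

The first four candidates are flagged. The title-cased form passes because only the leading N is case-flexible, and the approved underscore form passes because the separators are limited to a hyphen or a space.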
Handling Case Sensitivity
Vale enforces case sensitivity by default. To allow case-insensitive matches, prefix an entry with (?i); to allow only specific capitalizations, use a character class:
(?i)MongoDB
[Oo]bservability
Alternatively, disable Vale.Terms to rely on Vale.Spelling for traditional spell-checking:
[*.md]
BasedOnStyles = Vale
Vale.Terms = NO
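The two entry styles shown at the start of this section, (?i)MongoDB and [Oo]bservability, differ in how much they allow. The sketch below compares them using Python's re module as an approximation of Vale's matching; accepted_forms is a hypothetical helper.

```python
import re


def accepted_forms(entry: str, candidates: list[str]) -> list[str]:
    """Return the candidates that the entry, read as a regex, matches in full."""
    return [c for c in candidates if re.fullmatch(entry, c)]


mongo = ["MongoDB", "mongodb", "MONGODB", "Mongo DB"]
obs = ["Observability", "observability", "OBSERVABILITY"]

# Default behavior: only the exact capitalization matches.
print(accepted_forms("MongoDB", mongo))          # ['MongoDB']
# (?i) makes the whole entry case-insensitive.
print(accepted_forms("(?i)MongoDB", mongo))      # ['MongoDB', 'mongodb', 'MONGODB']
# A character class allows only the listed variants of one letter.
print(accepted_forms("[Oo]bservability", obs))   # ['Observability', 'observability']
```

The inline flag is the broader choice; the character class is useful when only the sentence-initial capital should be tolerated.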
Advanced: Targeting Vocabulary Entries
To override an ignored token, set vocab: false in a custom rule:
extends: existence
message: Did you mean '%s'?
vocab: false
tokens:
- MongoDB
This ensures MongoDB is flagged even if it's in accept.txt.
🛠️ Best Practices for Managing Vale Vocabulary
✔ Organize by Category – Separate Legal, Medical, and Tech terms in accept.txt and reject.txt.
✔ Use Regex for Flexibility – Handle variations ([Qq]uantum[- ]?computing matches multiple formats).
✔ Maintain a Shared Vocabulary – Store a single vocabulary folder across styles.
✔ Keep it Up to Date – Regularly review and refine terminology.
✔ Test with Vale – Run Vale’s command-line tool to validate enforcement.
vale --config=.vale.ini your-document.md
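Regular reviews are easier with a small lint step for the vocabulary files themselves. The sketch below is a hypothetical helper script, not part of Vale: it skips lines beginning with "# " as comments (as in the example files above), reports entries that fail to compile as regular expressions, and reports reject patterns that would also match an approved spelling. It relies on Python's re module, so results are an approximation of what Vale's engine would do.

```python
import re


def load_entries(text: str) -> list[str]:
    """Return non-empty, non-comment lines from a vocabulary file's text."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("# "):
            entries.append(line)
    return entries


def invalid_entries(entries: list[str]) -> list[str]:
    """Return the entries that are not valid regular expressions."""
    bad = []
    for entry in entries:
        try:
            re.compile(entry)
        except re.error:
            bad.append(entry)
    return bad


def conflicts(approved: list[str], reject_entries: list[str]) -> list[tuple[str, str]]:
    """Return (approved spelling, reject pattern) pairs where the pattern matches the spelling."""
    return [(term, pattern)
            for term in approved
            for pattern in reject_entries
            if re.fullmatch(pattern, term)]


# Hypothetical sample data: an optional separator lets the pattern match the approved form.
reject = load_entries("# Tech\n[Cc]rypto[- ]?currency\nJavascript\n")
print(invalid_entries(["GitHub", "[Unclosed"]))
print(conflicts(["Cryptocurrency", "JavaScript"], reject))
```

In this sample, the unterminated character class is reported as invalid, and the optional-separator pattern is reported as conflicting with Cryptocurrency. Making the separator mandatory, as in [Cc]rypto[- ]currency, removes the conflict.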
🚀 Why Use Vale’s Custom Vocabulary?
✅ Ensures Consistency – Standardizes terminology across teams.
✅ Reduces Editing Time – Automates error detection.
✅ Highly Customizable – Supports regex-based rules.
✅ Integrates with CI/CD – Works with GitHub Actions, GitLab CI/CD, Jenkins.
✅ Improves Compliance – Aligns with industry standards.
By leveraging Vale’s vocabulary system, you maintain clear, professional, and consistent documentation while automating quality checks. 🚀